The goal of this project is to compare and contrast the responses of plants and insects to environmental gradients. It uses the following data:
We have two plant datasets. plant_pa excludes sites classified as SOWWs; plant_pa2 includes sites classified as SOWWs. For the moment, let’s proceed using the data frame that includes SOWWs. Add the Natural Region designation to each site.
Summary of the distribution of sites:
veg_pa %>% distinct(Protocol,WetlandType,Site, NRNAME) %>% group_by(Protocol) %>% tally()
## # A tibble: 2 x 2
## Protocol n
## <chr> <int>
## 1 Terrestrial 557
## 2 Wetland 1027
veg_pa %>% distinct(Protocol,WetlandType,Site, NRNAME) %>% group_by(WetlandType) %>% tally()
## # A tibble: 7 x 2
## WetlandType n
## <chr> <int>
## 1 Bog 360
## 2 Marsh 424
## 3 Poor Fen 259
## 4 Rich Fen 66
## 5 Shallow Lake 232
## 6 Swamp 63
## 7 Wet Meadow 180
veg_pa %>% distinct(Protocol,WetlandType,Site, NRNAME) %>% group_by(NRNAME) %>% tally()
## # A tibble: 6 x 2
## NRNAME n
## <fct> <int>
## 1 Boreal 959
## 2 Canadian Shield 43
## 3 Foothills 114
## 4 Grassland 314
## 5 Parkland 131
## 6 Rocky Mountain 23
Exclude taxa that have not been ID’d to species (i.e. those that have been ID’d only to genus, or to subspecies).
Load and examine the available climate data. The histogram below shows the distribution of climatic variables for each focal site.
There are 25 sites without climate data. We must exclude them for now.
## # A tibble: 0 x 2
## # … with 2 variables: Protocol <chr>, n <int>
Bin the continuous climatic variables into 10 bins each with a similar number of sites. Assign each site to a bin.
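As a minimal sketch of the binning step, the snippet below assigns values to equal-frequency bins by rank. It uses base R only so it has no package dependencies; `bin_equal_freq()`, `clim_demo`, and the simulated MAT values are hypothetical and are not the code or data used in this report. `dplyr::ntile()` does much the same job.

```r
# Hypothetical helper: assign each value to one of n_bins equal-frequency bins.
bin_equal_freq <- function(x, n_bins = 10) {
  # Rank first so that tied values cannot collapse two bins into one.
  r <- rank(x, ties.method = "first", na.last = "keep")
  n <- sum(!is.na(x))
  as.integer(ceiling(r * n_bins / n))
}

# Simulated climate table (assumed structure: one row per site).
set.seed(1)
clim_demo <- data.frame(Site = sprintf("S%03d", 1:200),
                        MAT  = rnorm(200, mean = 1.5, sd = 2),
                        stringsAsFactors = FALSE)

clim_demo$MAT_bin <- bin_equal_freq(clim_demo$MAT)
table(clim_demo$MAT_bin)  # each of the 10 bins holds 20 sites
```

Binning by rank rather than by fixed-width intervals keeps the number of sites per bin similar even when a climatic variable is strongly skewed.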
We can examine the number of occurrences of each species in each climate bin. To do so, we join the vegetation df (veg_pa) to the climate df (clim2), sum the number of occurrences in each bin, and plot a histogram. For example, below we can see the occurrence frequency distribution of Typha latifolia across a gradient of CMD (Climatic Moisture Deficit).
Now calculate the occurrence frequency distribution for every species, across each climatic gradient. We will exclude species that occur only once: a single record always falls in a single bin, so these species would appear maximally sensitive to every gradient regardless of their true tolerance.
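The counting step can be sketched as follows. This is an illustration in base R with made-up records; `occ_demo` and its column names are assumptions, and the real analysis joins veg_pa to clim2 first.

```r
# Toy presence records: one row per species-by-site occurrence (hypothetical).
occ_demo <- data.frame(
  Species = c("Typha latifolia", "Typha latifolia", "Typha latifolia",
              "Carex aquatilis", "Carex aquatilis", "Abies bifolia"),
  CMD_bin = c(8, 9, 9, 2, 3, 1),
  stringsAsFactors = FALSE
)

# Drop species recorded only once.
n_occ    <- table(occ_demo$Species)
keep     <- names(n_occ)[n_occ > 1]
occ_kept <- occ_demo[occ_demo$Species %in% keep, ]

# Count occurrences per species per bin. Fixing the factor levels to 1:10
# keeps bins with no occurrences as explicit zeros.
freq <- table(Species = occ_kept$Species,
              Bin     = factor(occ_kept$CMD_bin, levels = 1:10))
freq
```

Keeping the empty bins matters for the next step, because the zeros contribute to both the mean and the standard deviation of each species's distribution.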
Also calculate the species sensitivity index (SSI), the coefficient of variation (sd/mean) of each species’s occurrence frequency distribution across each climatic gradient. This df is called sp_SSI.
## # A tibble: 6 x 6
## Species CV_FFP CV_MAP CV_MAT CV_SumPrecip CV_CMD
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Abies balsamea 0.551 0.835 1.15 0.868 0.551
## 2 Abies bifolia 1.32 3.16 1.94 3.16 1.32
## 3 Acer negundo 1.62 1.41 2.41 2.38 1.62
## 4 Achillea alpina 0.764 0.716 1.19 0.831 0.764
## 5 Achillea millefolium 0.230 0.179 0.321 0.163 0.230
## 6 Acorus americanus 1.36 1.29 1.65 1.42 1.36
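To make the index concrete, here is a small worked example of the coefficient of variation for two hypothetical species, assuming 10 bins and the sample standard deviation returned by `sd()`. The helper `cv()` and the count vectors are illustrative only.

```r
# Coefficient of variation of a vector of per-bin occurrence counts.
cv <- function(counts) sd(counts) / mean(counts)

even_sp   <- rep(5, 10)        # spread evenly across all 10 bins
narrow_sp <- c(7, rep(0, 9))   # every occurrence falls in a single bin

cv(even_sp)    # 0: no response to the gradient
cv(narrow_sp)  # sqrt(10), about 3.16: the maximum with 10 bins
```

Under these assumptions a species confined to one bin always scores sqrt(10), however many times it occurs, which may be why 3.16 appears as the largest value in the output above.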
Add SSI of each species (sp_SSI) to the vegetation df (veg_pa). Calculate the mean SSI of species at each site; this site-level mean is the community sensitivity index (CSI). Add the climate df (clim2).
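The join-and-average step might look like the sketch below. The SSI values are the CV_MAT values printed above for three species; the site assignments, the "Unlisted species" row, and the object names `veg_demo`, `ssi_demo`, and `csi_mat` are hypothetical.

```r
# Hypothetical site-by-species presence records.
veg_demo <- data.frame(
  Site    = c("A", "A", "B", "B"),
  Species = c("Achillea millefolium", "Acorus americanus",
              "Abies balsamea", "Unlisted species"),
  stringsAsFactors = FALSE
)

# Species-level SSI along the MAT gradient.
ssi_demo <- data.frame(
  Species = c("Abies balsamea", "Achillea millefolium", "Acorus americanus"),
  CV_MAT  = c(1.15, 0.321, 1.65),
  stringsAsFactors = FALSE
)

# Left join: species without an SSI get NA rather than being dropped.
joined <- merge(veg_demo, ssi_demo, by = "Species", all.x = TRUE)

# Site-level mean SSI, ignoring species that have no SSI.
csi_mat <- tapply(joined$CV_MAT, joined$Site, mean, na.rm = TRUE)
csi_mat
```

Whether species lacking an SSI should be ignored (as here, via `na.rm = TRUE`) or should cause the site to be excluded is a design choice that this sketch does not settle.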
Compare the distribution of CSI values for wetland vs terrestrial sites.
The Wetland protocol systematically underestimates the CSI of communities across all climatic gradients. Generally, though, communities at the extremes of each climate gradient show higher sensitivity to climatic conditions; these communities are composed of species with high fidelity to the environmental conditions.
Peatland communities are relatively insensitive to climatic conditions, whereas marshes, SOWWs, and wet meadows show a positive correlation between CSI and climatic gradients. However, peatlands (bogs and fens) occur more often in the boreal and the other wetland types occur in grasslands, so we are confounding wetland type with Natural Region and latitude.
There is a difference in the species richness of sites sampled with the Wetland and Terrestrial protocols. There are also differences in the number of sites sampled.
## # A tibble: 2 x 2
## Protocol n
## <chr> <int>
## 1 Terrestrial 773
## 2 Wetland 1280
Sites sampled with the wetland protocol have lower species richness than those sampled with the terrestrial protocol, but there is considerable overlap.
There is considerable overlap in the species richness of different wetland classes, although it is difficult to really tell because the lines are messy.
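For reference, a minimal sketch of the richness calculation, assuming richness is simply the number of distinct species recorded at each site. The records and the names `rich_demo` and `richness` are hypothetical.

```r
# Hypothetical presence records; the duplicated row mimics a repeat visit.
rich_demo <- data.frame(
  Site     = c("T1", "T1", "T1", "T1", "W1", "W1"),
  Protocol = c(rep("Terrestrial", 4), rep("Wetland", 2)),
  Species  = c("sp1", "sp2", "sp3", "sp3", "sp1", "sp4"),
  stringsAsFactors = FALSE
)

# Richness = number of distinct species per site.
richness <- tapply(rich_demo$Species, rich_demo$Site,
                   function(s) length(unique(s)))
richness
```

Counting distinct species rather than rows guards against inflating richness at sites that were recorded more than once.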
Natural regions differ a good bit in climatic conditions.
To compare the sensitivity of species captured in sites from the Terrestrial vs. Wetland protocols, try calculating SSI based only on the distribution of species from sites in one protocol. First calculate SSI based on the occurrence frequency at terrestrial sites. Then calculate the CSI of terrestrial and wetland sites. It could be interesting to see whether wetland sites then have generally higher, lower, or similar CSI values relative to terrestrial sites.
## # A tibble: 6 x 6
## Species CV_CMD CV_FFP CV_MAP CV_MAT CV_SumPrecip
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Abies balsamea 0.872 0.603 0.748 0.817 0.821
## 2 Abies bifolia 3.16 1.41 3.16 2.54 3.16
## 3 Acer negundo 2.83 2.16 2.49 1.70 1.89
## 4 Achillea alpina 0.612 0.456 0.898 0.866 0.924
## 5 Achillea millefolium 0.166 0.203 0.183 0.288 0.170
## 6 Acorus americanus 3.16 3.16 3.16 3.16 3.16
There are some slight differences in the distribution of CSI values between wetland and terrestrial sites when CSI is calculated based on the occurrence frequency at terrestrial sites only, but wetland sites don’t have systematically higher or lower CSI than terrestrial sites across the different gradients.
Using only species from terrestrial sites to create SSI results in pretty similar patterns of CSIs of Wetland and Terrestrial sites across climatic gradients. Compare this graph to the one created from SSI of both wetland and terrestrial sites (in section 4.2).
There are some qualitative differences depending on which sites are used to calculate SSI. When both terrestrial and wetland sites are used to calculate SSI, sites sampled with the wetland protocol consistently have lower CSI than terrestrial sites (i.e. wetland sites are less sensitive than terrestrial sites). When SSI is calculated using only terrestrial sites, however, sites sampled with the wetland protocol show higher sensitivity under moderate climatic conditions, whereas terrestrial sites show a strong reduction in sensitivity under moderate climatic conditions. That is, wetland sites show relatively more consistent CSI than terrestrial sites.
Now calculate SSI based on the occurrence frequency at wetland sites.
## # A tibble: 6 x 6
## Species CV_CMD CV_FFP CV_MAP CV_MAT CV_SumPrecip
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Abies balsamea 1.18 1.18 1.25 1.31 1.36
## 2 Abies bifolia 1.61 1.61 3.16 1.61 3.16
## 3 Acer negundo 2.25 2.25 2.25 2.25 2.25
## 4 Achillea alpina 1.17 1.17 1.08 1.78 1.07
## 5 Achillea millefolium 0.369 0.369 0.288 0.561 0.213
## 6 Acorus americanus 1.76 1.76 1.44 1.48 1.08
The distributions of CSI values don’t really differ between terrestrial and wetland sites when SSI is calculated with wetland data only. However, we can see that there are far fewer terrestrial sites for which we can assign CSI values, probably because the wetland and terrestrial datasets share only part of their species pools, so many species at terrestrial sites have no wetland-based SSI.
Using only sites from the Terrestrial or Wetland protocols to calculate SSI (below, based on MAT) underestimates the SSI. This underestimation is more severe for the more sensitive species (i.e. at higher SSI values). Nonetheless, there is a strong positive correlation between SSI calculated with both protocols and with each separate protocol.
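One way to check how many sites lose their CSI is to flag sites where no species has an SSI after the join. The sketch below uses made-up data; `ssi_wet`, `cov_demo`, `has_csi`, and the species codes are hypothetical.

```r
# Hypothetical SSI table built from wetland sites only.
ssi_wet <- data.frame(Species = c("sp1", "sp2"),
                      CV_MAT  = c(0.4, 1.2),
                      stringsAsFactors = FALSE)

# Hypothetical records from both protocols; sp3 never occurs at wetland sites.
cov_demo <- data.frame(
  Site     = c("W1", "W1", "T1", "T1", "T2"),
  Protocol = c("Wetland", "Wetland",
               "Terrestrial", "Terrestrial", "Terrestrial"),
  Species  = c("sp1", "sp2", "sp2", "sp3", "sp3"),
  stringsAsFactors = FALSE
)

joined <- merge(cov_demo, ssi_wet, by = "Species", all.x = TRUE)

# A site can be assigned a CSI only if at least one of its species has an SSI.
has_csi <- tapply(!is.na(joined$CV_MAT), joined$Site, any)
has_csi
```

Here site T2 drops out because its only species is absent from the wetland-based SSI table, while T1 is retained on the strength of a single shared species.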
How can we choose the most appropriate variables to describe community sensitivity? Use regression trees and/or random forest. To predict the community sensitivity index (CSI), use the following predictors:
Regression tree predicting CSI - MAT
Now grow a random forest to compare the importance of each predictor.
## IncNodePurity
## Protocol 7.180321
## NRNAME 22.391886
## WetlandType 2.233150
## spR 4.055290
## MAT 17.397950
## MAP 4.835931
## FFP 7.465682
## SumPrecip 3.874510
## CMD 17.485597
## [1] 0.821481
The final tree’s pseudo-R2 = 0.82.
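The report does not show how the pseudo-R2 was computed. A common definition, and to our knowledge the one the randomForest package reports for regression forests from out-of-bag predictions, is 1 - MSE / Var(y). The sketch below implements that definition in base R; `pseudo_r2()` and the example vectors are hypothetical, and the population variance (divisor n) is an assumption.

```r
# Pseudo-R2 as 1 - MSE / Var(y), using the population variance (divisor n).
pseudo_r2 <- function(obs, pred) {
  mse   <- mean((obs - pred)^2)
  var_y <- mean((obs - mean(obs))^2)
  1 - mse / var_y
}

obs  <- c(1, 2, 3, 4, 5)
pred <- c(1.2, 1.9, 3.1, 3.8, 5.3)
pseudo_r2(obs, pred)  # 0.981
```

Unlike the R2 of a linear model, this quantity can be negative when the predictions are worse than simply using the mean of the response.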